Lei Hou

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

May 29, 2026
Via arXiv

Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders

May 26, 2026
Via arXiv

StoryAlign: Evaluating and Training Reward Models for Story Generation

May 06, 2026
Via arXiv

MAIC-UI: Making Interactive Courseware with Generative UI

Apr 28, 2026
Via arXiv

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Feb 09, 2026
Via arXiv

MM-THEBench: Do Reasoning MLLMs Think Reasonably?

Jan 30, 2026
Via arXiv

On the Paradoxical Interference between Instruction-Following and Task Solving

Jan 29, 2026
Via arXiv

RPC-Bench: A Fine-grained Benchmark for Research Paper Comprehension

Jan 14, 2026
Via arXiv

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Jan 09, 2026
Via arXiv

WebSeer: Training Deeper Search Agents through Reinforcement Learning with Self-Reflection

Oct 21, 2025
Via arXiv